Skip to content

Conversation

@guettli
Copy link
Collaborator

@guettli guettli commented Nov 20, 2025

Before this PR roughly 3 of 10 make test-unit calls failed because of a flaky test. Many tests already contained the string "(flaky)" in the name.

The root-cause is that mocks and reconcilers are shared between tests.

All tests run sequentially (no concurrency). But changes done in one test could influence the second test.

Running tests in isolation (for example via FIt()) always succeeded.

This PR updates CreateNamespace() to ResetAndCreateNamespace().

Mocks get re-created via a Reset() method.

On my local device the tests were successful 70 times in a row (still running).


Introduces a Resetter/ResetAndCreateNamespace pattern for tests, to:

  • re-create mocks cleanly per test,
  • set a test namespace on reconcilers.

Resetting alone did not remove the flakiness. This PR ensures that mocks which have a state (like HCloudClient) do not get used, when a test has finished.

The background is explained in the kubebuilder docs: envTest, Namespace usage Limitations: Namespaces can't be deleted in envTests. This means objects from test-1 might still exist and getting reconciled while test-2 runs. When test-1 uses the stateful hcloudClient, although this test has already passed, it modifies the environment of test-2. Resetting does not resolve this.

This PR uses the pattern suggested by the kubebuilder docs: FooReconciler.Namespace. At the top of the Reconcile method the namespace gets checked. If set and different, reconcile is stopped.


The test for hbmm deletion was updated. It checks that the host gets deprovisioned. And the same test gets done in phase "ensure-provisioned".


HetznerBareMetalHostReconciler in "avoid conflict errors": Ignore NotFound error.


Closes #1685 #1506 #1435

Before the test did not realy test something
because the state is none at the beginning (before
provisioning), too.
@github-actions github-actions bot added size/M Denotes a PR that changes 50-200 lines, ignoring generated files. area/code Changes made in the code directory labels Nov 20, 2025
Copy link
Contributor

@janiskemper janiskemper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test doesn't test deletion anymore, but deprovisioning of the host when a hetznerbaremetalmachine is deleted. Can you explain why you change the test? Also, if you wanna do that change, you should update the test description

Background: All tests share the same reconciler and mock objects.

This means if test-1 changes the mocks, then test-2 could fail.

The method CreateNamespace() was changed to ResetAndCreateNamespace().

Flakiness should be reduced, because every test resets the mocks now.
@github-actions github-actions bot added size/L Denotes a PR that changes 200-800 lines, ignoring generated files. area/test Changes made in the test directory and removed size/M Denotes a PR that changes 50-200 lines, ignoring generated files. labels Nov 21, 2025
@guettli guettli changed the title 🌱 Fix hbmm delete unit tests. 🌱 Fix flaky unit-tests 🎉🎉🎉 Nov 21, 2025
@guettli guettli requested a review from janiskemper November 21, 2025 15:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/code Changes made in the code directory area/test Changes made in the test directory size/L Denotes a PR that changes 200-800 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Flaky Test: checks if PhaseWaiting is set when retryLimit is reached

4 participants